A Simple Adaptive Atomic Decomposition Voice Activity Detector Implemented by Matching Pursuit
Abstract
A simple adaptive voice activity detector (VAD) is implemented using Gabor and gammatone atomic decomposition of speech for high Gaussian noise environments. Matching pursuit is used for atomic decomposition, and is shown to achieve optimal speech detection capability at high data compression rates for low signal-to-noise ratios (SNRs). The most active dictionary elements found by matching pursuit are used for the signal reconstruction, so that the algorithm adapts to the individual speaker's dominant time-frequency characteristics. Speech has a high peak-to-average ratio, which enables the greedy matching pursuit heuristic of selecting the highest inner products to isolate high-energy speech components in high-noise environments. Gabor and gammatone atoms are both investigated with identical logarithmically spaced center frequencies and similar bandwidths. The algorithm performs equally well for both atom types, with no statistically significant difference. The algorithm achieves 70% accuracy at 0 dB SNR, 90% accuracy at 5 dB SNR, and 98% accuracy at 20 dB SNR, using 30 dB SNR as the reference for voice activity.
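The greedy selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the dictionary layout, the atom width, the hop size, the frame length, and the relative energy threshold are all assumptions chosen for illustration, and only Gabor atoms are shown. The function names `gabor_atom`, `build_dictionary`, `matching_pursuit`, and `detect_voice` are hypothetical.

```python
import numpy as np

def gabor_atom(length, center, freq, width, fs):
    """Unit-norm real Gabor atom: Gaussian envelope times a cosine."""
    t = (np.arange(length) - center) / fs
    atom = np.exp(-0.5 * (t / width) ** 2) * np.cos(2 * np.pi * freq * t)
    return atom / np.linalg.norm(atom)

def build_dictionary(length, fs, n_freqs=8, f_lo=200.0, f_hi=3000.0,
                     hop=40, width=0.004):
    """Atoms on a grid of log-spaced center frequencies and time shifts.

    All default values are illustrative assumptions, not the paper's settings.
    """
    freqs = np.geomspace(f_lo, f_hi, n_freqs)  # logarithmically spaced
    centers = np.arange(0, length, hop)
    params = [(f, c) for f in freqs for c in centers]
    atoms = np.array([gabor_atom(length, c, f, width, fs) for f, c in params])
    return atoms, params

def matching_pursuit(signal, dictionary, n_iter):
    """Greedy decomposition: repeatedly pick the atom with the largest
    absolute inner product with the residual and subtract its projection."""
    residual = np.array(signal, dtype=float)
    picks = []
    for _ in range(n_iter):
        products = dictionary @ residual
        k = int(np.argmax(np.abs(products)))
        picks.append((k, products[k]))
        residual = residual - products[k] * dictionary[k]
    return picks, residual

def detect_voice(signal, dictionary, n_iter=20, frame=80, rel_threshold=0.1):
    """Flag frames whose reconstructed energy exceeds a fraction of the peak.

    The thresholding rule is an assumed stand-in for the paper's decision step.
    """
    picks, _ = matching_pursuit(signal, dictionary, n_iter)
    recon = np.zeros(len(signal))
    for k, coef in picks:
        recon += coef * dictionary[k]
    n_frames = len(signal) // frame
    frames = recon[: n_frames * frame].reshape(n_frames, frame)
    energy = (frames ** 2).sum(axis=1)
    return energy > rel_threshold * energy.max()
```

Because only a handful of atoms are kept, the reconstruction discards most of the broadband noise while retaining the high-energy components, which is the intuition the abstract gives for robustness at low SNR. A gammatone dictionary could be substituted by swapping the atom generator while keeping the same center frequencies.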
Similar resources
Voice activity detection based on conjugate subspace matching pursuit and likelihood ratio test
Most voice activity detection (VAD) schemes operate in the discrete Fourier transform (DFT) domain, classifying each sound frame as speech or noise based on the DFT coefficients. These coefficients are used as features in VAD, and thus the robustness of these features has an important effect on the performance of the VAD scheme. However, some shortcomings of modeling a signal in the DFT...
A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)
Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, many non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech and noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...
New sound Decomposition Method Applied to Granular synthesis
In the field of granular decomposition of sound, the Matching Pursuit algorithm is particularly well suited to representing signals with simple sonic entities localized in time and frequency. Our main goal here is to extend this method towards a sound decomposition on a set of arbitrary microsounds, leading to a more adaptive framework.
A fast refinement for adaptive Gaussian chirplet decomposition
The chirp function is one of the most fundamental functions in nature. Many natural events, for example, most signals encountered in seismology and the signals in radar systems, can be modeled as the superposition of short-lived chirp functions. Hence, the chirp-based signal representation, such as the Gaussian chirplet decomposition, has been an active research area in the field of signal proc...
A four-parameter atomic decomposition of chirplets
A new four-parameter atomic decomposition of chirplets is developed for compact representation of signals with chirp components. The four-parameter atom is obtained by scaling the Gaussian function, and then applying the fractional Fourier transform (FRFT), time-shift and frequency-shift operators to the scaled Gaussian. The decomposition is realized by extending the matching pursuit algorithm t...